Indium Software | Data Engineer Interview Experience | 3-6 YoE



Technical Rounds:

Python Questions

🔹 Write an example of a tuple and a list.

🔹 Write a tuple that contains only one string.
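
One way to answer the two questions above is sketched here. The variable names are illustrative only. The point most interviewers look for is the trailing comma in a one-element tuple.

```python
# A list is mutable: it can be changed in place.
fruits_list = ["apple", "banana"]
fruits_list.append("cherry")

# A tuple is immutable: it cannot be modified after creation.
point_tuple = (3, 4)

# A one-element tuple needs a trailing comma.
# Without the comma, the parentheses are only grouping and the value stays a str.
single = ("hello",)
not_a_tuple = ("hello")

print(type(single).__name__)
print(type(not_a_tuple).__name__)
```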

🔹 What is a generator in Python?
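
A minimal sketch of a generator, assuming a hypothetical `countdown` function: a function that uses `yield` returns an iterator that produces values one at a time and pauses between them, instead of building the whole result in memory.

```python
def countdown(n):
    """Yield n, n-1, ..., 1, one value per request."""
    while n > 0:
        yield n
        n -= 1

gen = countdown(3)      # nothing in the body has run yet
first = next(gen)       # runs the body up to the first yield
rest = list(gen)        # consumes the remaining values

print(first, rest)
```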

🔹 In Python, what is **kwargs?

🔹 How many arguments can we pass using **kwargs? Is there any limit?
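
A short sketch for the two `**kwargs` questions above, using a hypothetical `describe` function. `**kwargs` collects extra keyword arguments into an ordinary dict. To my knowledge the language does not define a fixed limit on how many can be collected this way, so the practical bound is memory; treat that as an assumption to verify for your Python version.

```python
def describe(**kwargs):
    # kwargs is a plain dict mapping keyword names to values
    return kwargs

result = describe(name="sample", role="data engineer")

# A dict can be unpacked into keyword arguments with ** at the call site,
# which makes it easy to pass a large number of them.
many = {f"key_{i}": i for i in range(1000)}
count = len(describe(**many))

print(result, count)
```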

Python Coding Question

🔹 Find the first non-repeating character using a dictionary.

Example Input: "abxabyz" → Output: x
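
One possible solution, sketched with a plain dictionary as the question asks: count every character in a first pass, then return the first character whose count is 1. The function name and the `None` return for "no such character" are assumptions, not part of the original question.

```python
def first_non_repeating(text):
    """Return the first character that occurs exactly once, or None."""
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1

    # Second pass keeps the original order of the string.
    for ch in text:
        if counts[ch] == 1:
            return ch
    return None

print(first_non_repeating("abxabyz"))
```

Both passes are linear in the length of the string, so the overall cost is O(n) time with O(k) extra space for k distinct characters.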

SQL Questions

🔹 What is the execution order of an SQL query?

SQL Coding Question

🔹 A phone call is considered international when the caller and receiver are in different countries.

🔹 What percentage of phone calls are international? Round the result to 1 decimal place.

Example Input tables: phone_calls and phone_info (sample rows and expected output not shown).
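
Since the table contents are not shown, the following is only a sketch under assumptions: it assumes `phone_calls` has `caller_id` and `receiver_id` columns, that `phone_info` maps a `caller_id` to a `country_id`, and it uses made-up sample rows. The SQL is run through Python's built-in `sqlite3` module so it can be tried locally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and sample rows; the real tables may differ.
conn.executescript("""
CREATE TABLE phone_info (caller_id INTEGER PRIMARY KEY, country_id TEXT);
CREATE TABLE phone_calls (caller_id INTEGER, receiver_id INTEGER);
INSERT INTO phone_info VALUES (1, 'US'), (2, 'US'), (3, 'IN'), (4, 'UK');
INSERT INTO phone_calls VALUES (1, 2), (1, 3), (3, 4), (2, 1);
""")

# Join phone_info twice: once for the caller, once for the receiver.
query = """
SELECT ROUND(
         100.0 * SUM(CASE WHEN c.country_id <> r.country_id THEN 1 ELSE 0 END)
               / COUNT(*),
         1) AS international_calls_pct
FROM phone_calls AS pc
JOIN phone_info AS c ON pc.caller_id = c.caller_id
JOIN phone_info AS r ON pc.receiver_id = r.caller_id
"""

# In the sample rows, 2 of the 4 calls cross a country border.
pct = conn.execute(query).fetchone()[0]
print(pct)
```

Multiplying by `100.0` rather than `100` forces floating-point division, which matters in engines that would otherwise do integer division.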

Hadoop and Hive Questions

🔹 What do you mean by schema on read and schema on write?

🔹 Which one does Hive follow?

🔹 What is the difference between Hadoop 1.0 and Hadoop 2.0?

PySpark Questions

🔹 What is lazy evaluation in Spark? What happens during that time?

🔹 What is data skewness?

🔹 What steps will you follow to remove data skewness?

🔹 You have a DataFrame with two columns: student_id (millions of unique IDs) and Students_joining_date (only 5–6 unique dates). If you partition the data by each column in turn, which column will produce the larger partitions?
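
The intuition behind the last question can be sketched without a cluster. The simulation below is plain Python, not PySpark: simple modulo partitioning stands in for Spark's real hash partitioner, and the row count, partition count, and integer-encoded dates are all assumptions chosen for illustration. With only a handful of distinct dates, at most that many partitions receive data, so each one is large and the rest stay empty; unique IDs spread evenly.

```python
from collections import Counter

NUM_PARTITIONS = 8
NUM_ROWS = 100_000
NUM_DATES = 5

# Hypothetical data: each row has a unique student_id, and the joining date
# cycles through only 5 distinct values (encoded as integers 0..4).
rows = [(student_id, student_id % NUM_DATES) for student_id in range(NUM_ROWS)]

def partition_sizes(keys, num_partitions):
    """Count rows per partition under simple modulo partitioning."""
    sizes = Counter(key % num_partitions for key in keys)
    return [sizes.get(p, 0) for p in range(num_partitions)]

by_id = partition_sizes((sid for sid, _ in rows), NUM_PARTITIONS)
by_date = partition_sizes((date for _, date in rows), NUM_PARTITIONS)

print("partitioned by student_id:  ", by_id)
print("partitioned by joining date:", by_date)
```

In this sketch the date column yields a few large partitions and several empty ones, which is the same imbalance that shows up as data skew in a real job.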